Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: add Decision-Record about removing SFTP extensions #1208

Conversation

paullatzelsperger
Copy link
Contributor

WHAT

Adds a decision record about removing SFTP.

WHY

documenting the decision

FURTHER NOTES

List other areas of code that have changed but are not necessarily linked to the main feature. This could be method signature changes, package declarations, bugs that were encountered and were fixed inline, etc.

Closes # <-- insert Issue number if one exists

@paullatzelsperger paullatzelsperger added the documentation Improvements or additions to documentation label Apr 10, 2024
@paullatzelsperger paullatzelsperger changed the title docs: add Decision-Record about removing SFTP docs: add Decision-Record about removing SFTP extensions Apr 10, 2024
Copy link
Contributor

@ndr-brt ndr-brt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🪓

@florianrusch-zf
Copy link
Contributor

florianrusch-zf commented Apr 11, 2024

At least ZF and Cofinity-X would like to have an FTP Dataplane and we are also working on it together. We would like to propose a working dataplane as a contribution, but as usual other things have more priority on our side at the moment.

The reason why we would like the EDC to support another protocol is that S3 (and also Azure Blob) is a proprietary protocol and therefore does not fulfil the needs of companies that want to exchange big data without being forced to use one of the big cloud providers as the storage provider.

\cc @dr-markus

@paullatzelsperger
Copy link
Contributor Author

paullatzelsperger commented Apr 11, 2024

EDC (upstream) supports other non-proprietary protocols such as Kafka and HTTP and Tx-EDC supports HTTP as default. If you're looking for an open-source non-proprietary alternative to S3, just use MinIO. As far as we can tell, and as far as it concerns us, the compatibility is 100%. And that's even without talking about the technical merits of S3 over (s)FTP.

At this time, I see absolutely no reason why such an sFTP data plane couldn't be implemented under the auspices of either Cofinity-X or ZF in their respective Git respositories and published under their respective namespaces. It could even be another Tractus-X repository, that Cofinity-X and ZF maintain. Conversely, I see no reason why those modules should remain in Tx-EDC.

If you are currently using the SFTP modules, then you already must be creating your own distribution of EDC, because the standard Tx-EDC distributions don't contain the SFTP modules, and haven't for quite some time. Adopting the modules into ZF/Cofinity would not change anything from the status quo, other than the Maven group-id of those modules and it would not prevent (your) EDC from supporting them.

So while I completely understand your point, it is not contradictory to this decision-record, and if I'm brutally honest, wanting the SFTP modules to remain in Tractus-X EDC sounds a lot like "alligator hands".

@florianrusch-zf
Copy link
Contributor

Of course we could "copy/paste" the extensions into our EDC build (which we have already done during testing/development). But we're not just thinking of ourselves. Many small and medium sized businesses won't have a cloud subscription or want to host a full-blown MinIO instance. Most of them have almost no IT department. However, for us as Tractus-X, it makes perfect sense to offer SFTP, which is also an already established standard for data storage/exchange worldwide. Refusing to have this in our "standard" Tractus-X EDC is a big mistake for me and @dr-markus.

Of course I totally agree with you that the current implementation is not sufficiently maintained! But instead of saying "let's get rid of it" or "you can provide it yourselves" we should think about how we can get rid of the maintenance issue and how we can make sure it is maintained as a core Tractus-X EDC protocol in the future.

@paullatzelsperger
Copy link
Contributor Author

paullatzelsperger commented Apr 12, 2024

Hosting MinIO is not harder than hosting (and properly configuring!) an SFTP server, so I fail to see the validity in that argument.
My point is that for almost a year, no-one that uses our standard distribution in Tractus-X could have been using SFTP, because we didn't package it. And we have received exactly 0 requests to that effect. Your mileage may vary, which is why I think that functionality is better suited for a ZF/Cofinity variant.

We absolutely cannot keep those modules in the Tx-EDC code base in their current form, because they are poorly implemented and violate several coding and architectural principles, and they pose a potential security hole.

We discussed removing them already, about a year ago, and the decision back then was to keep them, but to rewrite and improve them, which obviously has not happened. Since it is exclusively you who wants to keep them around, and no-one (including you) has picked up that baton, I can only assume that the priority is not high enough.

I think one year is enough of a grace period.

The "you can provide it yourself" is key aspect and architectural principle of EDC.

@jimmarino
Copy link
Contributor

Of course we could "copy/paste" the extensions into our EDC build (which we have already done during testing/development). But we're not just thinking of ourselves. Many small and medium sized businesses won't have a cloud subscription or want to host a full-blown MinIO instance. Most of them have almost no IT department. However, for us as Tractus-X, it makes perfect sense to offer SFTP, which is also an already established standard for data storage/exchange worldwide. Refusing to have this in our "standard" Tractus-X EDC is a big mistake for me and @dr-markus.

Of course, I totally agree with you that the current implementation is not sufficiently maintained! But instead of saying "let's get rid of it" or "you can provide it yourselves," we should think about how we can eliminate the maintenance issue and ensure that it is maintained as a core Tractus-X EDC protocol in the future.

I don't think that contradicts what this DR proposes. A separate repository can be established in TX to maintain and release the extension properly, which would appear to fulfill your requirements. As it stands now, the current code is not properly maintained or up to the standards required for general use.

I also don't think the use case for small and medium-sized businesses is common. If a business has limited IT staff, it is much cheaper (on the pennies!), less technically challenging, and more secure to use a cloud provider's object storage than to host and properly maintain (S)FTP servers. Properly operating scalable, reliant (S)FTP servers is challenging and typically not something an SMB would invest in.

The core TX-EDC team has limited resources, and we are trying to deploy them judiciously to cover high-priority use cases. Unfortunately, limited resources force us to make choices. The prioritization seems clear based on the fact that the SFTP extensions have not been maintained or included in a TX distribution for a long time, nor have they been proposed as a C-X standard.

@lgblaumeiser
Copy link
Contributor

Hi, @paullatzelsperger @jimmarino,

Actually, we just prepared a contribution to complete the SFTP extension, so that it can be used in a dataplane. It is a planned feature of Dataspace OS and we (Cofinity-X in association with ZF) are willing to maintain the feature in the future. Of course it could be maintained in a different repository, but I would strongly suggest to keep it where it is to simplify the consistency of the extension.

Btw., Minio is in productive use a limited option, due to legal issues and efforts to fulfill legal constraints.

@paullatzelsperger
Copy link
Contributor Author

paullatzelsperger commented Apr 12, 2024

keep it where it is to simplify the consistency of the extension.

I'm sorry, this makes no sense to me. Why is that simpler? If Cofinity-X/ZF rolls their own distribution of EDC, it does not matter where the extension is hosted. I said it once, and will say it again: the official Tx-EDC distribution and repository can only contain officially sanctioned and standardized C-X extensions, which SFTP is not.
Tx-EDC is not a toolbox of random extensions, but a reference implementation. I thought that was understood.

willing to maintain the feature in the future

Please understand that at this point I can regard this as lip service at best b/c past experience is indication to the contrary. Of course, ZF/Cofinity is welcome to provide a PR with a polished SFTP implementation in the future, and it will have to go through the usual acceptance and review process, but at present, we cannot keep the SFTP modules.

@lgblaumeiser
Copy link
Contributor

Tx-EDC is not a toolbox of random extensions, but a reference implementation. I thought that was understood.

If the Tractus-X repository is dedicated for the task of providing the "official" distribution of the Tractus-X EDC, I understand and support the deletion of the piece. On the other hand, I would see the necessity to have a place to share quality extensions within the community, so that end users can make use of the feature easily. What do you propose as place for something like that? Does this already exist?

Actually, with simpler I meant, that an own repository comes with additional efforts in setting the repository up to provide the necessary quality infrastructure, visibility is decreased, so people who might need the feature won't find it, and finally it is an additional repository to be maintained by the overall project infrastructure.

@paullatzelsperger
Copy link
Contributor Author

paullatzelsperger commented Apr 12, 2024

This could very well be a separate tractusx repo, that is maintained by ZF/Cof. It is very easy to set that up. The EF has a tool called Otterdog where additional repos can be requested, and we even provide a template repo for that exact purpose, that comes complete with all required legal documents, quality processes etc. It does not get any easier than that.

In addition, users can just use maven artifacts from maven central, so there is no need to use the repo directly and build from source. Thus, visibility is not impacted at all.

[edit] we will face this same situation regularly going forward. I like the idea of an "EDC community extensions" repo

@jimmarino
Copy link
Contributor

Tx-EDC is not a toolbox of random extensions, but a reference implementation. I thought that was understood.

If the Tractus-X repository is dedicated for the task of providing the "official" distribution of the Tractus-X EDC, I understand and support the deletion of the piece. On the other hand, I would see the necessity to have a place to share quality extensions within the community, so that end users can make use of the feature easily. What do you propose as place for something like that? Does this already exist?

Actually, with simpler I meant, that an own repository comes with additional efforts in setting the repository up to provide the necessary quality infrastructure, visibility is decreased, so people who might need the feature won't find it, and finally it is an additional repository to be maintained by the overall project infrastructure.

A place to share quality extensions or a directory would be a very useful thing to have. We could do something similar to upstream EDC's Known Friends Directory.

We already have a Github template project that does most of the heavy lifting to set up an extension repository. We could continue to develop the template project to automate this process even further.

@lgblaumeiser
Copy link
Contributor

That sounds good, we will discuss the approach and move forward. Thanks, Paul, Jim.

@florianrusch-zf
Copy link
Contributor

For me and @dr-markus it totally makes sense to provide sftp as a standard protocol in the tractus-x edc.

the official Tx-EDC distribution and repository can only contain officially sanctioned and standardized C-X extensions, which SFTP is not.

@paullatzelsperger Where is it documented which EDC extensions are officially standardised by C-X and where is the process documented for proposing an extension to be officially standardised by C-X? And who decides on this?

In the past it was (more or less) decided by us developers or not?

Please don't misunderstand my comments - I agree with you that we should get rid of unused code! I just want to clarify that there is a need for SFTP and that we should clarify how we can proceed with including SFTP as a official supported protocol in the Tractus-X EDC. So for me it would be perfectly fine if we get rid of the existing code if we have a plan for the future.

@paullatzelsperger
Copy link
Contributor Author

paullatzelsperger commented Apr 15, 2024

I think a good way forward would be to spin up a community or "contrib" repository as outlined in earlier comments, where people can develop and maintain their extensions, and if a strong need arises to adopt one into the standard distribution of EDC, we decide that on a case-by-case basis. We can even do so without moving any code, provided that a Maven publication exists.
I do think that certain acceptance criteria and rules have to be established for that.

As for standards, they were buried deep in some Confluence page, but that may have changed now. They should, however, define interop such as DSP and IATP plus ODRL policy expressions, and I would be hesitant to want to mandate the use of a particular dataplane technology as that wouldn't contribute to interop.

@jimmarino
Copy link
Contributor

For me and @dr-markus it totally makes sense to provide sftp as a standard protocol in the tractus-x edc.

the official Tx-EDC distribution and repository can only contain officially sanctioned and standardized C-X extensions, which SFTP is not.

@paullatzelsperger Where is it documented which EDC extensions are officially standardised by C-X and where is the process documented for proposing an extension to be officially standardised by C-X? And who decides on this?

In the past it was (more or less) decided by us developers or not?

If you want to standardize SFTP, you should join the CX Standardization Working Group and discuss it there. Keep in mind that standardization would require all implementations of the Catena-X wire protocol (including EDC) and all participants to support the feature, so the bar for standardization is high.

@paullatzelsperger paullatzelsperger force-pushed the docs/decision_record_sftp branch 2 times, most recently from 5a8c150 to 58e98b6 Compare April 15, 2024 09:28
Copy link

sonarcloud bot commented Apr 15, 2024

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@paullatzelsperger paullatzelsperger merged commit d14591a into eclipse-tractusx:main Apr 15, 2024
31 checks passed
@paullatzelsperger paullatzelsperger deleted the docs/decision_record_sftp branch April 15, 2024 11:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
Status: Merged
Development

Successfully merging this pull request may close these issues.

6 participants